GenomeScope: fast reference-free genome profiling from short reads
Abstract
Summary: GenomeScope is an open-source web tool to rapidly estimate the overall characteristics of a genome, including genome size, heterozygosity rate and repeat content from unprocessed short reads. These features are essential for studying genome evolution, and help to choose parameters for downstream analysis. We demonstrate its accuracy on 324 simulated and 16 real datasets with a wide range in genome sizes, heterozygosity levels and error rates.
Availability and Implementation: http://genomescope.org, https://github.com/schatzlab/genomescope.git.
Contact: [email protected].
Supplementary information: Supplementary data are available at Bioinformatics online.
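The abstract does not spell out how genome size can be read from raw reads without a reference. As a rough illustration only, the sketch below applies the classic k-mer peak heuristic: divide the total number of non-error k-mer occurrences by the coverage at the main histogram peak. This is not GenomeScope's own model, which fits a statistical model to the k-mer profile. The function name `estimate_genome_size`, the `min_coverage` cutoff and the toy histogram are hypothetical, and the histogram is assumed to come from a separate k-mer counting step.

```python
def estimate_genome_size(histogram, min_coverage=5):
    """Rough genome-size estimate from a k-mer histogram.

    histogram: dict mapping k-mer multiplicity (coverage) to the number of
               distinct k-mers observed with that multiplicity.
    min_coverage: hypothetical cutoff; k-mers seen fewer times than this are
                  treated as sequencing errors and ignored.
    Returns (estimated_size, peak_coverage).
    """
    # Drop the low-coverage error k-mers.
    solid = {cov: n for cov, n in histogram.items() if cov >= min_coverage}
    if not solid:
        raise ValueError("no k-mers at or above min_coverage")

    # Coverage value shared by the largest number of distinct k-mers.
    peak = max(solid, key=solid.get)

    # Total k-mer occurrences among the retained k-mers.
    total_kmers = sum(cov * n for cov, n in solid.items())

    # Each genomic position contributes about `peak` k-mer occurrences.
    return total_kmers / peak, peak


# Toy histogram (made-up numbers): an error spike at low coverage and a
# symmetric main peak centred on coverage 20.
toy = {1: 1000, 2: 200, 18: 10, 19: 30, 20: 50, 21: 30, 22: 10}
size, peak = estimate_genome_size(toy)
```

A single-peak heuristic like this ignores heterozygosity and repeats, which add further peaks to the histogram, so it is only a first approximation.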
Related articles
Reference-free prediction of rearrangement breakpoint reads
MOTIVATION Chromosome rearrangement events are triggered by atypical breaking and rejoining of DNA molecules, which are observed in many cancer-related diseases. The detection of rearrangement is typically done by using short reads generated by next-generation sequencing (NGS) and combining the reads with knowledge of a reference genome. Because structural variations and genomes differ from one...
An Ultra-fast Approach to Align Longer Short Reads onto Human Genome
With the advent of second-generation sequencing (SGS) technologies, deoxyribonucleic acid (DNA) sequencing machines have started to produce reads, named as “longer short reads”, which are much longer than previous generation reads, the so called “short reads”. Unfortunately, most of the existing read aligners do not scale well for those second-generation longer short reads. Moreover, many of th...
Impact of Gene Annotation on RNA-seq Data Analysis
RNA-seq has become increasingly popular in transcriptome profiling. One of the major challenges in RNA-seq data analysis is the accurate mapping of junction reads to their genomic origins. To detect splicing sites in short reads, many RNA-seq aligners use reference transcriptome to inform placement of junction reads. However, no systematic evaluation has been performed to assess or quantify the...
A fast and efficient algorithm for mapping short sequences to a reference genome.
Novel high-throughput (Deep) sequencing technology methods have redefined the way genome sequencing is performed. They are able to produce tens of millions of short sequences (reads) in a single experiment and with a much lower cost than previous sequencing methods. In this paper, we present a new algorithm for addressing the problem of efficiently mapping millions of short reads to a reference...
Mapping Accuracy of Short Reads from Massively Parallel Sequencing and the Implications for Quantitative Expression Profiling
BACKGROUND Massively parallel sequencing offers an enormous potential for expression profiling, in particular for interspecific comparisons. Currently, different platforms for massively parallel sequencing are available, which differ in read length and sequencing costs. The 454-technology offers the highest read length. The other sequencing technologies are more cost effective, on the expense o...
Journal: Bioinformatics
Volume 33, Issue 14
Publication date: 2017